Words containing ,:

Rank in Wordlist | Frequency | Word |
---|---|---|
2721 | 36 | 1,5 |
3571 | 26 | 2,5 |
5952 | 14 | 0,5 |
6314 | 13 | 1,2 |
7185 | 11 | 3,5 |
7727 | 10 | 1,3 |
7728 | 10 | 1,7 |
8404 | 9 | 1,1 |
8405 | 9 | 1,6 |
8406 | 9 | 1,8 |

Words containing (:

Rank in Wordlist | Frequency | Word |
---|---|---|
26699 | 2 | ANGELES(VG |
36401 | 2 | lever(er |
41199 | 1 | -(dårligare |
43114 | 1 | A(H1N1 |
49095 | 1 | Gjemnes(stekningen |
50242 | 1 | Hijab(Islamistisk |
51541 | 1 | Juliussen(63 |
51937 | 1 | Kindle(et |
52932 | 1 | LOVE)(LOVE)(LOVE)(LOVE)(LO |
58610 | 1 | Sanden(Ap |

Words containing ):

Rank in Wordlist | Frequency | Word |
---|---|---|
42404 | 1 | 24)er |
48390 | 1 | FrP)reagerer |
52491 | 1 | KrF)Foto |
52932 | 1 | LOVE)(LOVE)(LOVE)(LOVE)(LO |
55461 | 1 | Nett)En |
55734 | 1 | Norge)se |
56557 | 1 | POLITICS)/Scanpix/Reuters |
58507 | 1 | Salhusbrua)over |
64071 | 1 | alternetiv)løsning |
64533 | 1 | arkivbilde)Foto |

Words containing &:

Rank in Wordlist | Frequency | Word |
---|---|---|
8437 | 9 | B&R |
20107 | 3 | C&C4 |
27838 | 2 | FT&IF |
29268 | 2 | L&M |
29538 | 2 | M&R |
43199 | 1 | AT&T |
44151 | 1 | B&Rs |
46674 | 1 | E&Y |
49555 | 1 | H&M |
51250 | 1 | J.O.N&gjengen |

Words containing $:

Rank in Wordlist | Frequency | Word |
---|---|---|
51854 | 1 | Ke$ha |

Words containing ":

Rank in Wordlist | Frequency | Word |
---|---|---|
26775 | 2 | Alkis"-bot |
43686 | 1 | Angels"-skuespilleren |
44342 | 1 | Bane-stengt"-skiltet |
45420 | 1 | Budeie"-monumentet |
45861 | 1 | City"-katastrofen |
50764 | 1 | I"-prisvinner |
50832 | 1 | IT"-sjef |
54135 | 1 | Marina"s |
54215 | 1 | Mathilde"l |
58029 | 1 | Rottenetter"- |

Words containing ':

Rank in Wordlist | Frequency | Word |
---|---|---|
3017 | 32 | Assassin's |
6398 | 13 | Mirror's |
14331 | 5 | d'Italia |
16457 | 4 | Lekter'n |
16548 | 4 | N'jie |
20744 | 3 | Jackson's |
20773 | 3 | Jonny's |
20992 | 3 | Let's |
26802 | 2 | America's |
27510 | 2 | Dante's |

Words containing +:

Rank in Wordlist | Frequency | Word |
---|---|---|
62406 | 1 | Utredning+utredning+utredning |

Words containing /:

Rank in Wordlist | Frequency | Word |
---|---|---|
1766 | 59 | Bodø/Glimt |
2775 | 36 | km/t |
4925 | 18 | Tips/bilder |
7750 | 10 | AC/DC |
7860 | 10 | Junge/Scanpix |
7928 | 10 | Scanpix/AFP |
9413 | 8 | Larsen/Scanpix |
9508 | 8 | Sandvik/NRK |
10568 | 7 | Poppe/Scanpix |
11778 | 6 | Håkedal/Tørlen |
In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words, while others are not. If words containing disallowed characters do not have a very low frequency, there may be a problem in preprocessing.
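The check described above can be sketched in a few lines of Python. This is only an illustration: the function name `flag_suspicious`, the `allowed` parameter, and the threshold `max_freq=5` are assumptions made for the example and are not part of the report. The sample rows are taken from the result tables of this subsection.

```python
from collections import defaultdict

# Characters checked in this subsection.
SPECIAL_CHARS = ",()%&$\"'+*=/_"


def flag_suspicious(wordlist, allowed="", max_freq=5):
    """Group words by the disallowed special character they contain.

    wordlist: iterable of (rank, freq, word) tuples.
    allowed:  characters considered legitimate for the language (assumption).
    max_freq: hypothetical cut-off; only words occurring more often than
              this are reported as possible preprocessing problems.
    """
    flagged = defaultdict(list)
    for rank, freq, word in wordlist:
        if freq <= max_freq:
            continue  # rare words are treated as harmless noise
        for ch in SPECIAL_CHARS:
            if ch not in allowed and ch in word:
                flagged[ch].append((rank, freq, word))
    return dict(flagged)


sample = [
    (2721, 36, "1,5"),
    (51854, 1, "Ke$ha"),
    (1766, 59, "Bodø/Glimt"),
    (52932, 1, "LOVE)(LOVE)(LOVE)(LOVE)(LO"),
]

# Here "/" is assumed to be acceptable inside words, so only the frequent
# word containing a comma is reported.
result = flag_suspicious(sample, allowed="/", max_freq=5)
print(result)
```

Which characters count as allowed, and which frequency counts as "very low", depends on the language and the corpus size, so both are left as parameters.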
Words containing %:
select w_id-100, freq, word from words where w_id>100 and word like "%\%%" limit 10;
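The same query can be run for any character in the list by changing the LIKE pattern. Note that % and _ are LIKE wildcards and must be escaped, as the query above does for %. The following is a minimal sketch, not the tool used to produce the report: it uses an in-memory SQLite database as a stand-in, so the escape character has to be declared with an explicit ESCAPE clause. The helper names `like_pattern` and `words_containing`, the table layout, and the rows "100%" and "under_score" are assumptions for illustration only.

```python
import sqlite3


def like_pattern(ch, escape="\\"):
    """Build a LIKE pattern matching words that contain the literal ch."""
    # % and _ are LIKE wildcards; the escape character itself must be
    # escaped as well.
    if ch in ("%", "_", escape):
        ch = escape + ch
    return f"%{ch}%"


def words_containing(conn, ch, limit=10):
    """Return (rank, freq, word) rows for words containing ch."""
    sql = (
        "SELECT w_id - 100, freq, word FROM words "
        "WHERE w_id > 100 AND word LIKE ? ESCAPE '\\' "
        "ORDER BY w_id LIMIT ?"
    )
    return conn.execute(sql, (like_pattern(ch), limit)).fetchall()


# Hypothetical miniature word table; column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT, freq INTEGER)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [
        (50, "%", 1000),            # w_id <= 100 is skipped by the query
        (2875, "km/t", 36),         # rank 2775 in the "/" table
        (51954, "Ke$ha", 1),        # rank 51854 in the "$" table
        (60000, "100%", 1),         # invented row
        (60001, "under_score", 1),  # invented row
    ],
)

for ch in ("%", "_", "/", "$"):
    print(ch, words_containing(conn, ch))
```

Without the escaping step, the pattern for _ would match every non-empty word, which would make the corresponding result table meaningless.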
3.12.1 Words with Hyphens
3.12.2 Multiwords
3.12.3 (Multi-)Words with dots